SEQUENCE LISTING 

(1) GENERAL INFORMATION:

(i) APPLICANT:
NAME: F. HOFFMANN-LA ROCHE AG
STREET: Grenzacherstrasse 124
CITY: Basle
COUNTRY: Switzerland
POSTAL CODE: CH-4002
TELEPHONE: 061 - 688 25 05
FAX: 061 - 688 13 95
TELEX: 962292/965542 hlr c

(ii) TITLE OF INVENTION: Novel Alcohol/Aldehyde Dehydrogenases

(iii) NUMBER OF SEQUENCES: 12

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: Macintosh
(C) OPERATING SYSTEM:
(D) SOFTWARE: MS Word ver 5.1



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1740 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) ORIGINAL SOURCE:
ORGANISM: Gluconobacter oxydans
STRAIN: DSM 4025

(iv) FEATURE:
FEATURE KEY: CDS
POSITION: 1..1737
SEQUENCING METHOD: E



ATGAAACCGA CTTCGCTGCT TTGGGCCAGT GCTGGCGCAC TTGCATTGCT 50
TGCCGCACCC GCCTTTGCTC AAGTGACCCC CGTCACCGAT GAATTGCTGG 100
CGAACCCGCC CGCTGGTGAA TGGATCAGCT ACGGTCAGAA CCAAGAAAAC 150
TACCGTCACT CGCCCCTGAC GCAGATCACG ACTGAGAACG TCGGCCAACT 200
GCAACTGGTC TGGGCGCGCG GCATGCAGCC GGGCAAAGTC CAAGTCACGC 250
CCCTGATCCA TGACGGCGTC ATGTATCTGG CAAACCCGGG CGACGTGATC 300
CAGGCCATCG ACGCCAAAAC TGGCGATCTG ATCTGGGAAC ACCGCCGCCA 350
ACTGCCGAAC ATCGCCACGC TGAACAGCTT TGGCGAGCCG ACCCGCGGCA 400
TGGCGCTGTA CGGCACCAAC GTTTACTTTG TTTCGTGGGA CAACCACCTG 450
GTCGCCCTCG ACACCGCAAC TGGCCAAGTG ACGTTCGACG TCGACCGCGG 500
CCAAGGCGAA GACATGGTTT CGAACTCGTC GGGCCCOATC GTGGCAAACG 550
GCGTGaW TGCCGGTTCG ACCTGCCAAT ACTCGCCGTT CGGCTGCTTT 600
G^TCGGGCC ACGACTCGGC CACCGGTGAA GAGCTGTGGC GCAACTACTT 650
CATCCCGCGC GCTGGCGAAG AGGGTGATGA GACTTGGGGC AACGATTACG 700
AAGCCCGTTG GATGACCGGT GCCTGGGGCC AGATCACCTA TGACCCCGTC 750
ACCAACCTTG TCCACTACGG CTCGACCGCT GTGGGTCCGG CGTCGGAAAC 800
CCAACGCGGC aWgGGCG GCACGCTGTA CGGCACGAAC ACCCGTTTCG 850
CGGTGCGTCC TGaWgGC GAGATTGTCT GGCGTCACCA GACCCTGCCC 900
CGCGACAACT GGGACCAGGA ATGCACGTTC GAGATGATGG TCACCAATGT 950
GGATGTCCAA CCCTCGACCG AGATGGAAGG TCTGCAGTCG ATCAACCCGA 1000
ACGCCGCAAC ^CGAgWt CGCGTGCTOA CCGGCGTTCC GTGCAAAACC 1050
GGCACCATGT GGCAGTTCGA CGCCGAAACC GGCGAATTCC TGTGGGCCCG 1100
TGATACCAAC TJUXMAAck TGATCGAATC CATCGACGAA AACGGCATCG 1150
TGACCGTGAA CGAAGATGCG WcTCAAGG AACTGGATGT TGAATATGAC 1200
GTCTGCCCGA CCTTCTTGGG cWcGCGAC TGGCCGTCGG CCGCACTGAA 1250
CCCCGACAGC GGCATCTACT TCATCCCGCT GAACAACGTC TGCTATGACA 1300
TGATGGCCGT CGATCAGGAA TTCaWga TGGACGTCTA TAACACCAGC 1350
AACGTGACCA AGCTGCCGCC CGGCaLgAT ATGATCGGTC GTATTGACGC 1400
GATCGACATC AGCACGGGTC GTACGCTGTG GTCGGTCGAA CGTGCTGCGG 1450
CGAACTATTC GCCCGTCTTG TCGACCG<L GCGGCGTTCT GTTCAACGGT 1500 
GGTACGGATC GTTACTTCCG CGCCCTCAgV CAAGAAACCG GCGAGACCCT 1550
GTGGCAGACC CGCCTTGCAA CCGTCGCGTC WcAGGCC ATCTCTTACG 1600
AGGTTGACGG CATGCAATAT GTCGCCATCG LgTGGTCG TGTCAGCTAT 1650
GGCTCGGGCC TGAACTCGGC ACTGGCTGGC Ga\cGAGTCG ACTCGACCGC 1700
CATCGGTAAC GCCGTCTACG TCTTCGCCCT GCcWtAA 1740

INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1740 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) ORIGINAL SOURCE:
ORGANISM: Gluconobacter oxydans
STRAIN: DSM 4025

(iv) FEATURE:
FEATURE KEY: CDS
POSITION: 1..1737
SEQUENCING METHOD: E



ATGAAGACGT CGTCTTTGCT GGTTOKTGAGC GTTGCCGCGC TTGCAAGCTA 50
TAGCTCCTTT GCGCTTGCTC AAGTgWc CGTCACCGAT GAATTGCTGG 100
CGAACCCGCC CGCTGGTGAA TGGATCAGCT ACGGTCAGAA CCAAGAAAAC 150
TACCGTCACT CGCCCCTGAC GCAGATCACG ACTGAGAACG TCGGCCAACT 200
GCAACTGGTC TGGGCGCGCG GCATGCAGCcWcAAAGTC CAAGTCACGC 250
CCCTGATCCA TGACGGCGTC ATGTATCTGG CAAACCCGGG CGACGTGATC 300
CAGGCCATCG ACGCCAAAAC TGGCGATCTG At\tGGGAAC ACCGCCGCCA 350
ACTGCCGAAC ATCGCCACGC TGAACAGCTT TGGGGAGCCG ACCCGCGGCA 400
TGGCGCTGTA CGGCACCAAC GTTTACTTTG TTTCGTGGGA CAACCACCTG 450
GTCGCCCTCG ACACCGCAAC TGGCCAAGTG ACGTTCGACG TCGACCGCGG 500
CCAAGGCGAA GACATGGTTT CGAACTCGTC GGGCCCGATC GTGGCAAACG 550
GCGTfeATCGT TGCCGGTTCG ACCTGCCAAT ACTCGCCGTT CGGCTGCTTT 600
GTCTCGGGCC ACGACTCGGC CACCGGTGAA GAGCTGTGGC GCAACTACTT 650
CATCCCGCGC GCTGGCGAAG AGGGTGATGA GACTTGGGGC AACGATTACG 700
AAGCCCGTTG GATGACCGGC GTCTGGGGTC AGATCACCTA TGACCCCGTT 750
GGCGGCCTTG TCCACTACGG CTCGTCGGCT GTTGGCCCGG CTTCGGAAAC 800
CCAGCGCGGC ACCACCGGCG GCACCATGTA CGGCACCAAC ACCCGTTTCG 850
CTGTCCGTCC CGAGACTGGC GAGATCGTCT GGCGTCACCA AACTCTGCCC 900
CGCGACAACT GGGACCAAGA GTGCACCTTC GAGATGATGG TTGCCAACGT 950
TGACGTGCAG CCCGCAGCTG ACATGGACGG CGTCCGCTCG ATCAACCCGA 1000
ACGCCGCCAC CGGCGAGCGT CGCGTTCTGA CCGGCGTTCC GTGCAAAACC 1050
GGCACCATGT GGCAGTTCGA CGCCGAAACC GGCGAATTCC TGTGGGCCCG 1100
TGACACCAGC TACGAGAACA TCATCGAATC GATCGACGAA AACGGCATCG 1150
TGACCGTCGA CGAGTCGAAA GTTCTGACCG AGCTGGACAC CCCCTATGAC 1200
GTCTGCCCGC TGCTGCTGGG TGGCCGTGAC TGGCCGTCGG CTGCGCTGAA 1250
CCCCGATACC GGCATCTACT TTATCCCGCT GAACAACACC TGCATGGATA 1300
TCGAAGCTGT CGACCAGGAA TTCAGCTCGC TGGACGTGTA CAACCAAAGC 1350
CTGACCGCCA AAATGGCACC GGGTAAAGAG CTGGTTGGCC GTATCGACGC 1400
CATCGACATC AGCACAGGCC GCACCCTGTG GACCGCTGAG CGCGAAGCCT 1450
CGAACTACGC GCCTGTCCTG TCGACCGCTG GCGGCGTTCT GTTCAACGGC 1500
GGCACCGACC GTTACTTCCG CGCTCTCAGC CAAGAGACCG GCGAGACCCT 1550
GTGGCAGACC CGTCTGGCGA CTGTCGCTTC GGGCCAAGCT GTCTCGTACG 1600
AGATCGACGG CGTCCAATAC ATCGCCATCG GCGGGGGCGG CACGACCTAT 1650
GGTTCGTTCC ACAACCGTCC CCTGGCCGAG CCGGTOSACT CGACCGCGAT 1700
CGGTAATGCG ATGTACGTCT TCGCGCTGCC CCAGCAATAA 1740

INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1737 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) ORIGINAL SOURCE:
ORGANISM: Gluconobacter oxydans
STRAIN: DSM 4025

(iv) FEATURE:
FEATURE KEY: CDS
POSITION: 1..1734
SEQUENCING METHOD: E


ATGAAACTGA CGACCCTGCT GCAAAGCAGC GCCGCCCTGC TTGTGCTTGG 50
CACCATTCCC GCCCTTGCCC AAACCGCCAT CACCGATGAA ATGCTGGCGA 100
ACCCGCCCGC TGGTGAATGG ATCAACTACG GTCAGAACCA AGAGAACTAC 150
CGCCACTCGC CCCTGACGCA GATTACCGCA GACAACGTCG GCCAACTGCA 200
ACTGGTCTGG GCGCGCGGTA TGGAAGCGGG CAAGATCCAA GTGACCCCGC 250
TTGTCCATGA CGGCGTCATG TATCTGGCAA ACCCCGGTGA CGTGATCCAG 300
GCCATCGACG CCGCGACCGG CGATCTGATC TGGGAACACC GCCGCCAACT 350
GCCGAACATC GCCACGCTGA ACAGCTTTGG TGAGCCGACC CGCGGCATGG 400
CCCTCTATGG CACCAACGTC TATTTCGTCT CGTGGGACAA CCACTTGGTC 450
GCGCTGGACA CCTCGACCGG CCAAGTCGTA TTCGACGTCG ATCGCGGTCA 500
AGGCACGGAT ATGGTCTCGA ACTCGTCCGG CCCGATTGTC GCCAATGGCG 550
TCATCGTTGC GGGCTCGACC TGTCAGTATT CGCCGTTCGG CTGTTTCGTT 600
TCGGGCCACG ACTCGGCCAC CGGTGAAGAG CTGTGGCGCA ACAACTTTAT 650
CCCGCGCG(SC GGCGAAGAGG GTGATGAGAC CTGGGGCAAT GATTACGAGG 700
CCCGCTGGAT GACCGGCGTT TGGGGCCAGA TCACCTATGA CCCCGTTGGC 750
GGCCTTGTCC ACTACGGCAC CTCAGCAGTT GGCCCTGCGG CCGAGATTCA 800
GCGCGGCACC GTTGGCGGCT CGATGTATGG CACCAACACC CGCTTTGCTG 850
TCCGCCCCGA GACCGGCGAG ATCGTCTGGC GTCACCAAAC TCTGCCCCGC 900
GACAACTGGG ACCAAGAGTG TACGTTCGAG ATGATGGTCG TCAACGTCGA 950
CGTCCAGCCC TCGGCTGAGA TGGAAGGCCT GCACGCCATC AACCCCGATG 1000
CCGCCACGGG CGAGCGTCGC GTTGTGACCG GCGTTCCGTG CAAGAACGGC 1050
ACCATGTGGC AGTTCGACGC CGAAACCGGC GAATTCCTGT GGGCGCGCGA 1100
CACCAGCTAT CAGAACC TCGAAAGCGT CGATCCCGAT GGTCTGGTGC 1150
ATGTGAACGA AGATCTGGTQ GTGACCGAGC TGGAAGTGGC CTATGAAATC 1200
TGCCCGACCT TCCTGGGTGG CCGCGACTGG CCGTCGGCTG CGCTGAACCC 1250
CGATACTGGC ATCTATTTCA TCCCGCTGAA CAACGCCTGT AGCGGTATGA 1300
CGGCTGTCGA CCAAGAGTTC AGCTCGCTCG ATGTGTATAA CGTCAGCCTC 1350
GACTATAAAC TGTCGCCCGG TTCQGAAAAC ATGGGCCGTA TCGACGCCAT 1400
CGACATCAGC ACCGGCCGCA CGCTGTGGTC GGCTGAACGC TACGCCTCGA 1450
ACTACGCGCC TGTCCTGTCC ACCGGCGGCG GCGTGCTGTT CAACGGCGGC 1500
ACCGACCGTT ACTTCCGCGC CCTCAGCGAA GAGACCGGCG AGACGCTGTG 1550
GCAGACCCGT CTGGCGACTG TCGCCTCGGG TCAAGCGATT TCCTATGAGA 1600
TCGACGGCGT GCAATATGTC GCCATCGGGC GCGGCGGCAC CAGCTATGGC 1650
AGCAACCACA ACCGCGCCCT GACCGAGCGG ATCGACTCGA CCGCCATCGG 1700
CAGCGCGATC TATGTCTTTG CTCTGCCGCA GCAGTAA 1737

INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1740 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) ORIGINAL SOURCE:
ORGANISM: Gluconobacter oxydans
STRAIN: DSM 4025

(iv) FEATURE:
FEATURE KEY: CDS
POSITION: 1..1737
SEQUENCING METHOD: E


ATGAACCCCA CAACGCTGCT TCGGACCAGC 50
CGCGCCCGCC GCATTCGCGC AGGTAACCCC
CGAACCCGCC CGCTGGTGAA TGGAT'I(AACT ACGGCCGCAA CCAAGAAAAC 150
TATCGCCACT CGCCCCTGAC CCAGATCACT
GCAACTGGTC TGGGCCCGCG GGATGGAGGC GGGGGCCGTA CAGGTCACGC 250
CGATGATCCA TGATGGCGTG ATGTATCTGG
CAGGCGCTGG ATGCGCAAAC AGGCGATCTG ATCTGGGAAC ACCGCCGCCA 350
ACTGCCCGCC GTCGCCACGC TAAACGCCCA
TCGCCCTTTA CGGCACGAGC CTCTATTTCA
ATCGCGCTGG ATATGGAGAC GGGCCAGGTC
ATCGGGCGAA GACGGCTTGA CCAGTAACAC CACGGGGCCG ATTGTCGCCA 550
ATGGCGTCAT CGTCGCGGGT TCCACCTGCC AATATTCGCC CTATGGATGC 600
TTTATCTCGG GGCACGATTC CGCGACGGGT GAGGAGCTGT GGCGCAACCA 650
CTTTATCCCG CAGCCGGGCG AAGAGGGTGA CGAGACTTGG GGCAATGATT 700
TCGAGGCGCG CTGGATGACC GGCGTCTGGG GTCAGATCAC CTATGATCCC 750
GTGACGAACC TTGTOTTCTA TGGCTCGACC GGCGTGGGCC CAGCGTCCGA 800
AACCCAGCGC GGCACGCCGG GCGGCACGCT GTATGGCACC AACACCCGCT 850
TTGCGGTGCG TCCCGACA(SG GGCGAGATTG TCTGGCGTCA CCAGACCCTG 900
CCGCGCGACA ACTGGGACCA AGAATGCACG TTCGAGATGA TGGTCGCCAA 950
CGTCGATGTG CAACCCTCGG CCGAGATGGA GGGTCTGCGC GCCATCAACC 1000
CCAATGCGGC GACGGGCGAG CGCCGTGTGC TGACGGGTGC GCCTTGCAAG 1050
ACCGGCACGA TGTGGTCGTT TGATGCGGCC TCGGGCGAAT TCCTGTGGGC 1100
GCGTGATACC AACTACACCA ATATGATCGC CTCGATCGAC GAGACCGGCC 1150
TTGTGACGGT GAACGAGGAT GCGGTGCTGA AAGAGCTGGA CGTTGAATAT 1200
GACGTCTGCC CGACCTTCCT GGGTGGGCGC GACTGGTCGT CAGCCGCACT 1250
GAACCCGGAC ACCGGCATTT ACTTCTTGCC GCTGAACAAT GCCTGCTACG 1300
ATATTATGGC CGTTGATCAA GAGTTTAGCG CGCTCGACGT CTATAACACC 1350
AGCGCGACCG CAAAACTCGC GCCGGGCTTT Gj lTATGG GCCGCATCGA 1400
CGCGATTGAT ATCAGCACCG GGCGCACCTT GTGGTCGGCG GAGCGCCCTG 1450
CGGCGAACTA CTCGCCCGTT TTGTCGACGG CAGGCGGTGT GGTGTTCAAC 1500
GGCGGGACCG ACCGCTATTT CCGTGCCCTC AGCCA( CCGGCGAGAC 1550
TTTGTGGCAG GCCCGTCTTG CGACGGTCGC GACGGGGCAG GCGATCAGCT 1600
ACGAGTTGGA CGGCGTGCAA TATATCGCCA TCGGTGCi CGGTCTGACC 1650
TATGGCACGC AATTGAACGC GCCGCTGGCC GAGGCAATCG ATTCGACCTC 1700
GGTCGGTAAT GCGATCTATG TCTTTGCACT GCCGCAGTAA 1740



INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 579 residues
(B) TYPE: amino acid
(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) ORIGINAL SOURCE:
ORGANISM: Gluconobacter oxydans
STRAIN: DSM 4025

(iv) FEATURE:
FEATURE KEY: sig peptide
POSITION: -23..-1
SEQUENCING METHOD: E
FEATURE KEY: mat peptide
POSITION: 1..556
SEQUENCING METHOD: E



Met Lys Pro Thr Ser Leu Leu Trp Ala Ser Ala Gly Ala Leu Ala
            -20                 -15                 -10

Leu Leu Ala Ala Pro Ala Phe Ala Gln Val Thr Pro Val Thr Asp
            -5                  1               5

Glu Leu Leu Ala Asn Pro Pro Ala Gly Glu Trp Ile Ser Tyr Gly
        10                  15                  20

Gln Asn Gln Glu Asn Tyr Arg His Ser Pro Leu Thr Gln Ile Thr
        25                  30                  35

Thr Glu Asn Val Gly Gln Leu Gln Leu Val Trp Ala Arg Gly Met
        40                  45                  50

Gln Pro Gly Lys Val Gln Val Thr Pro Leu Ile His Asp Gly Val
        55                  60                  65

Met Tyr Leu Ala Asn Pro Gly Asp Val Ile Gln Ala Ile Asp Ala
        70                  75                  80

Lys Thr Gly Asp Leu Ile Trp Glu His Arg Arg Gln Leu Pro Asn
        85                  90                  95

Ile Ala Thr Leu Asn Ser Phe Gly Glu Pro Thr Arg Gly Met Ala
        100                 105                 110

Leu Tyr Gly Thr Asn Val Tyr Phe Val Ser Trp Asp Asn His Leu
        115                 120                 125

Val Ala Leu Asp Thr Ala Thr Gly Gln Val Thr Phe Asp Val Asp
        130                 135                 140

Arg Gly Gln Gly Glu Asp Met Val Ser Asn Ser Ser Gly Pro Ile
        145                 150                 155

Val Ala Asn Gly Val Ile Val Ala Gly Ser Thr Cys Gln Tyr Ser
        160                 165                 170

Pro Phe Gly Cys Phe Val Ser Gly His Asp Ser Ala Thr Gly Glu
        175                 180                 185

Glu Leu Trp Arg Asn Tyr Phe Ile Pro Arg Ala Gly Glu Glu Gly
        190                 195                 200

Asp Glu Thr Trp Gly Asn Asp Tyr Glu Ala Arg Trp Met Thr Gly
        205                 210                 215

Ala Trp Gly Gln Ile Thr Tyr Asp Pro Val Thr Asn Leu Val His
        220                 225                 230

Tyr Gly Ser Thr Ala Val Gly Pro Ala Ser Glu Thr Gln Arg Gly
        235                 240                 245

Thr Pro Gly Gly Thr Leu Tyr Gly Thr Asn Thr Arg Phe Ala Val
        250                 255                 260

Arg Pro Asp Thr Gly Glu Ile Val Trp Arg His Gln Thr Leu Pro
        265                 270                 275

Arg Asp Asn Trp Asp Gln Glu Cys Thr Phe Glu Met Met Val Thr
        280                 285                 290

Asn Val Asp Val Gln Pro Ser Thr Glu Met Glu Gly Leu Gln Ser
        295                 300                 305

Ile Asn Pro Asn Ala Ala Thr Gly Glu Arg Arg Val Leu Thr Gly
        310                 315                 320

Val Pro Cys Lys Thr Gly Thr Met Trp Gln Phe Asp Ala Glu Thr
        325                 330                 335

Gly Glu Phe Leu Trp Ala Arg Asp Thr Asn Tyr Gln Asn Met Ile
        340                 345                 350

Glu Ser Ile Asp Glu Asn Gly Ile Val Thr Val Asn Glu Asp Ala
        355                 360                 365

Ile Leu Lys Glu Leu Asp Val Glu Tyr Asp Val Cys Pro Thr Phe
        370                 375                 380

Leu Gly Gly Arg Asp Trp Pro Ser Ala Ala Leu Asn Pro Asp Ser
        385                 390                 395

Gly Ile Tyr Phe Ile Pro Leu Asn Asn Val Cys Tyr Asp Met Met
        400                 405                 410

Ala Val Asp Gln Glu Phe Thr Ser Met Asp Val Tyr Asn Thr Ser
        415                 420                 425

Asn Val Thr Lys Leu Pro Pro Gly Lys Asp Met Ile Gly Arg Ile
        430                 435                 440

Asp Ala Ile Asp Ile Ser Thr Gly Arg Thr Leu Trp Ser Val Glu
        445                 450                 455

Arg Ala Ala Ala Asn Tyr Ser Pro Val Leu Ser Thr Gly Gly Gly
        460                 465                 470

Val Leu Phe Asn Gly Gly Thr Asp Arg Tyr Phe Arg Ala Leu Ser
        475                 480                 485

Gln Glu Thr Gly Glu Thr Leu Trp Gln Thr Arg Leu Ala Thr Val
        490                 495                 500

Ala Ser Gly Gln Ala Ile Ser Tyr Glu Val Asp Gly Met Gln Tyr
        505                 510                 515

Val Ala Ile Ala Gly Gly Gly Val Ser Tyr Gly Ser Gly Leu Asn
        520                 525                 530

Ser Ala Leu Ala Gly Glu Arg Val Asp Ser Thr Ala Ile Gly Asn
        535                 540                 545

Ala Val Tyr Val Phe Ala Leu Pro Gln
        550                 555

INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 579 residues
(B) TYPE: amino acid
(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) ORIGINAL SOURCE:
ORGANISM: Gluconobacter oxydans
STRAIN: DSM 4025

(iv) FEATURE:
FEATURE KEY: sig peptide
POSITION: -23..-1
SEQUENCING METHOD: S
FEATURE KEY: mat peptide
POSITION: 1..556
SEQUENCING METHOD: S



Met Lys Thr Ser Ser Leu Leu Val Ala Ser Val Ala Ala Leu Ala
            -20                 -15                 -10

Ser Tyr Ser Ser Phe Ala Leu Ala Gln Val Thr Pro Val Thr Asp
            -5                  1               5

Glu Leu Leu Ala Asn Pro Pro Ala Gly Glu Trp Ile Ser Tyr Gly
        10                  15                  20

Gln Asn Gln Glu Asn Tyr Arg His Ser Pro Leu Thr Gln Ile Thr
        25                  30                  35

Thr Glu Asn Val Gly Gln Leu Gln Leu Val Trp Ala Arg Gly Met
        40                  45                  50

Gln Pro Gly Lys Val Gln Val Thr Pro Leu Ile His Asp Gly Val
        55                  60                  65

Met Tyr Leu Ala Asn Pro Gly Asp Val Ile Gln Ala Ile Asp Ala
        70                  75                  80

Lys Thr Gly Asp Leu Ile Trp Glu His Arg Arg Gln Leu Pro Asn
        85                  90                  95

Ile Ala Thr Leu Asn Ser Phe Gly Glu Pro Thr Arg Gly Met Ala
        100                 105                 110

Leu Tyr Gly Thr Asn Val Tyr Phe Val Ser Trp Asp Asn His Leu
        115                 120                 125

Val Ala Leu Asp Thr Ala Thr Gly Gln Val Thr Phe Asp Val Asp
        130                 135                 140

Arg Gly Gln Gly Glu Asp Met Val Ser Asn Ser Ser Gly Pro Ile
        145                 150                 155

Val Ala Asn Gly Val Ile Val Ala Gly Ser Thr Cys Gln Tyr Ser
        160                 165                 170

Pro Phe Gly Cys Phe Val Ser Gly His Asp Ser Ala Thr Gly Glu
        175                 180                 185

Glu Leu Trp Arg Asn Tyr Phe Ile Pro Arg Ala Gly Glu Glu Gly
        190                 195                 200

Asp Glu Thr Trp Gly Asn Asp Tyr Glu Ala Arg Trp Met Thr Gly
        205                 210                 215

Val Trp Gly Gln Ile Thr Tyr Asp Pro Val Gly Gly Leu Val His
        220                 225                 230

Tyr Gly Ser Ser Ala Val Gly Pro Ala Ser Glu Thr Gln Arg Gly
        235                 240                 245

Thr Thr Gly Gly Thr Met Tyr Gly Thr Asn Thr Arg Phe Ala Val
        250                 255                 260

Arg Pro Glu Thr Gly Glu Ile Val Trp Arg His Gln Thr Leu Pro
        265                 270                 275

Arg Asp Asn Trp Asp Gln Glu Cys Thr Phe Glu Met Met Val Ala
        280                 285                 290

Asn Val Asp Val Gln Pro Ala Ala Asp Met Asp Gly Val Arg Ser
        295                 300                 305

Ile Asn Pro Asn Ala Ala Thr Gly Glu Arg Arg Val Leu Thr Gly
        310                 315                 320

Val Pro Cys Lys Thr Gly Thr Met Trp Gln Phe Asp Ala Glu Thr
        325                 330                 335

Gly Glu Phe Leu Trp Ala Arg Asp Thr Ser Tyr Glu Asn Ile Ile
        340                 345                 350

Glu Ser Ile Asp Glu Asn Gly Ile Val Thr Val Asp Glu Ser Lys
        355                 360                 365

Val Leu Thr Glu Leu Asp Thr Pro Tyr Asp Val Cys Pro Leu Leu
        370                 375                 380

Leu Gly Gly Arg Asp Trp Pro Ser Ala Ala Leu Asn Pro Asp Thr
        385                 390                 395

Gly Ile Tyr Phe Ile Pro Leu Asn Asn Thr Cys Met Asp Ile Glu
        400                 405                 410

Ala Val Asp Gln Glu Phe Ser Ser Leu Asp Val Tyr Asn Gln Ser
        415                 420                 425

Leu Thr Ala Lys Met Ala Pro Gly Lys Glu Leu Val Gly Arg Ile
        430                 435                 440

Asp Ala Ile Asp Ile Ser Thr Gly Arg Thr Leu Trp Thr Ala Glu
        445                 450                 455

Arg Glu Ala Ser Asn Tyr Ala Pro Val Leu Ser Thr Ala Gly Gly
        460                 465                 470

Val Leu Phe Asn Gly Gly Thr Asp Arg Tyr Phe Arg Ala Leu Ser
        475                 480                 485

Gln Glu Thr Gly Glu Thr Leu Trp Gln Thr Arg Leu Ala Thr Val
        490                 495                 500

Ala Ser Gly Gln Ala Val Ser Tyr Glu Ile Asp Gly Val Gln Tyr
        505                 510                 515

Ile Ala Ile Gly Gly Gly Gly Thr Thr Tyr Gly Ser Phe His Asn
        520                 525                 530

Arg Pro Leu Ala Glu Pro Val Asp Ser Thr Ala Ile Gly Asn Ala
        535                 540                 545

Met Tyr Val Phe Ala Leu Pro Gln Gln
        550                 555



INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 578 residues
(B) TYPE: amino acid
(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) ORIGINAL SOURCE:
ORGANISM: Gluconobacter oxydans
STRAIN: DSM 4025

(iv) FEATURE:
FEATURE KEY: sig peptide
POSITION: -23..-1
SEQUENCING METHOD: S
FEATURE KEY: mat peptide
POSITION: 1..555
SEQUENCING METHOD: S



Met Lys Leu Thr Thr Leu Leu Gln Ser Ser Ala Ala Leu Leu Val
            -20                 -15                 -10

Leu Gly Thr Ile Pro Ala Leu Ala Gln Thr Ala Ile Thr Asp Glu
            -5                  1               5

Met Leu Ala Asn Pro Pro Ala Gly Glu Trp Ile Asn Tyr Gly Gln
        10                  15                  20

Asn Gln Glu Asn Tyr Arg His Ser Pro Leu Thr Gln Ile Thr Ala
        25                  30                  35

Asp Asn Val Gly Gln Leu Gln Leu Val Trp Ala Arg Gly Met Glu
        40                  45                  50

Ala Gly Lys Ile Gln Val Thr Pro Leu Val His Asp Gly Val Met
        55                  60                  65

Tyr Leu Ala Asn Pro Gly Asp Val Ile Gln Ala Ile Asp Ala Ala
        70                  75                  80

Thr Gly Asp Leu Ile Trp Glu His Arg Arg Gln Leu Pro Asn Ile
        85                  90                  95

Ala Thr Leu Asn Ser Phe Gly Glu Pro Thr Arg Gly Met Ala Leu
        100                 105                 110

Tyr Gly Thr Asn Val Tyr Phe Val Ser Trp Asp Asn His Leu Val
        115                 120                 125

Ala Leu Asp Thr Ser Thr Gly Gln Val Val Phe Asp Val Asp Arg
        130                 135                 140

Gly Gln Gly Thr Asp Met Val Ser Asn Ser Ser Gly Pro Ile Val
        145                 150                 155

Ala Asn Gly Val Ile Val Ala Gly Ser Thr Cys Gln Tyr Ser Pro
        160                 165                 170

Phe Gly Cys Phe Val Ser Gly His Asp Ser Ala Thr Gly Glu Glu
        175                 180                 185

Leu Trp Arg Asn Asn Phe Ile Pro Arg Ala Gly Glu Glu Gly Asp
        190                 195                 200

Glu Thr Trp Gly Asn Asp Tyr Glu Ala Arg Trp Met Thr Gly Val
        205                 210                 215

Trp Gly Gln Ile Thr Tyr Asp Pro Val Gly Gly Leu Val His Tyr
        220                 225                 230

Gly Thr Ser Ala Val Gly Pro Ala Ala Glu Ile Gln Arg Gly Thr
        235                 240                 245

Val Gly Gly Ser Met Tyr Gly Thr Asn Thr Arg Phe Ala Val Arg
        250                 255                 260

Pro Glu Thr Gly Glu Ile Val Trp Arg His Gln Thr Leu Pro Arg
        265                 270                 275

Asp Asn Trp Asp Gln Glu Cys Thr Phe Glu Met Met Val Val Asn
        280                 285                 290

Val Asp Val Gln Pro Ser Ala Glu Met Glu Gly Leu His Ala Ile
        295                 300                 305

Asn Pro Asp Ala Ala Thr Gly Glu Arg Arg Val Val Thr Gly Val
        310                 315                 320

Pro Cys Lys Asn Gly Thr Met Trp Gln Phe Asp Ala Glu Thr Gly
        325                 330                 335

Glu Phe Leu Trp Ala Arg Asp Thr Ser Tyr Gln Asn Leu Ile Glu
        340                 345                 350

Ser Val Asp Pro Asp Gly Leu Val His Val Asn Glu Asp Leu Val
        355                 360                 365

Val Thr Glu Leu Glu Val Ala Tyr Glu Ile Cys Pro Thr Phe Leu
        370                 375                 380

Gly Gly Arg Asp Trp Pro Ser Ala Ala Leu Asn Pro Asp Thr Gly
        385                 390                 395

Ile Tyr Phe Ile Pro Leu Asn Asn Ala Cys Ser Gly Met Thr Ala
        400                 405                 410

Val Asp Gln Glu Phe Ser Ser Leu Asp Val Tyr Asn Val Ser Leu
        415                 420                 425

Asp Tyr Lys Leu Ser Pro Gly Ser Glu Asn Met Gly Arg Ile Asp
        430                 435                 440

Ala Ile Asp Ile Ser Thr Gly Arg Thr Leu Trp Ser Ala Glu Arg
        445                 450                 455

Tyr Ala Ser Asn Tyr Ala Pro Val Leu Ser Thr Gly Gly Gly Val
        460                 465                 470

Leu Phe Asn Gly Gly Thr Asp Arg Tyr Phe Arg Ala Leu Ser Gln
        475                 480                 485

Glu Thr Gly Glu Thr Leu Trp Gln Thr Arg Leu Ala Thr Val Ala
        490                 495                 500

Ser Gly Gln Ala Ile Ser Tyr Glu Ile Asp Gly Val Gln Tyr Val
        505                 510                 515

Ala Ile Gly Arg Gly Gly Thr Ser Tyr Gly Ser Asn His Asn Arg
        520                 525                 530

Ala Leu Thr Glu Arg Ile Asp Ser Thr Ala Ile Gly Ser Ala Ile
        535                 540                 545

Tyr Val Phe Ala Leu Pro Gln Gln
        550                 555



INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 579 residues
(B) TYPE: amino acid
(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) ORIGINAL SOURCE:
ORGANISM: Gluconobacter oxydans
STRAIN: DSM 4025

(iv) FEATURE:
FEATURE KEY: sig peptide
POSITION: -23..-1
SEQUENCING METHOD: E
FEATURE KEY: mat peptide
POSITION: 1..556
SEQUENCING METHOD: E



Met Asn Pro Thr Thr Leu Leu Arg Thr Ser Ala Ala Val Leu Leu
            -20                 -15                 -10

Leu Thr Ala Pro Ala Ala Phe Ala Gln Val Thr Pro Ile Thr Asp
            -5                  1               5

Glu Leu Leu Ala Asn Pro Pro Ala Gly Glu Trp Ile Asn Tyr Gly
        10                  15                  20

Arg Asn Gln Glu Asn Tyr Arg His Ser Pro Leu Thr Gln Ile Thr
        25                  30                  35

Ala Asp Asn Val Gly Gln Leu Gln Leu Val Trp Ala Arg Gly Met
        40                  45                  50

Glu Ala Gly Ala Val Gln Val Thr Pro Met Ile His Asp Gly Val
        55                  60                  65

Met Tyr Leu Ala Asn Pro Gly Asp Val Ile Gln Ala Leu Asp Ala
        70                  75                  80

Gln Thr Gly Asp Leu Ile Trp Glu His Arg Arg Gln Leu Pro Ala
        85                  90                  95

Val Ala Thr Leu Asn Ala Gln Gly Asp Arg Lys Arg Gly Val Ala
        100                 105                 110

Leu Tyr Gly Thr Ser Leu Tyr Phe Ser Ser Trp Asp Asn His Leu
        115                 120                 125

Ile Ala Leu Asp Met Glu Thr Gly Gln Val Val Phe Asp Val Glu
        130                 135                 140

Arg Gly Ser Gly Glu Asp Gly Leu Thr Ser Asn Thr Thr Gly Pro
        145                 150                 155

Ile Val Ala Asn Gly Val Ile Val Ala Gly Ser Thr Cys Gln Tyr
        160                 165                 170

Ser Pro Tyr Gly Cys Phe Ile Ser Gly His Asp Ser Ala Thr Gly
        175                 180                 185

Glu Glu Leu Trp Arg Asn His Phe Ile Pro Gln Pro Gly Glu Glu
        190                 195                 200

Gly Asp Glu Thr Trp Gly Asn Asp Phe Glu Ala Arg Trp Met Thr
        205                 210                 215

Gly Val Trp Gly Gln Ile Thr Tyr Asp Pro Val Thr Asn Leu Val
        220                 225                 230

Phe Tyr Gly Ser Thr Gly Val Gly Pro Ala Ser Glu Thr Gln Arg
        235                 240                 245

Gly Thr Pro Gly Gly Thr Leu Tyr Gly Thr Asn Thr Arg Phe Ala
        250                 255                 260

Val Arg Pro Asp Thr Gly Glu Ile Val Trp Arg His Gln Thr Leu
        265                 270                 275

Pro Arg Asp Asn Trp Asp Gln Glu Cys Thr Phe Glu Met Met Val
        280                 285                 290

Ala Asn Val Asp Val Gln Pro Ser Ala Glu Met Glu Gly Leu Arg
        295                 300                 305

Ala Ile Asn Pro Asn Ala Ala Thr Gly Glu Arg Arg Val Leu Thr
        310                 315                 320

Gly Ala Pro Cys Lys Thr Gly Thr Met Trp Ser Phe Asp Ala Ala
        325                 330                 335

Ser Gly Glu Phe Leu Trp Ala Arg Asp Thr Asn Tyr Thr Asn Met
        340                 345                 350

Ile Ala Ser Ile Asp Glu Thr Gly Leu Val Thr Val Asn Glu Asp
        355                 360                 365

Ala Val Leu Lys Glu Leu Asp Val Glu Tyr Asp Val Cys Pro Thr
        370                 375                 380

Phe Leu Gly Gly Arg Asp Trp Ser Ser Ala Ala Leu Asn Pro Asp
        385                 390                 395

Thr Gly Ile Tyr Phe Leu Pro Leu Asn Asn Ala Cys Tyr Asp Ile
        400                 405                 410

Met Ala Val Asp Gln Glu Phe Ser Ala Leu Asp Val Tyr Asn Thr
        415                 420                 425

Ser Ala Thr Ala Lys Leu Ala Pro Gly Phe Glu Asn Met Gly Arg
        430                 435                 440

Ile Asp Ala Ile Asp Ile Ser Thr Gly Arg Thr Leu Trp Ser Ala
        445                 450                 455

Glu Arg Pro Ala Ala Asn Tyr Ser Pro Val Leu Ser Thr Ala Gly
        460                 465                 470

Gly Val Val Phe Asn Gly Gly Thr Asp Arg Tyr Phe Arg Ala Leu
        475                 480                 485

Ser Gln Glu Thr Gly Glu Thr Leu Trp Gln Ala Arg Leu Ala Thr
        490                 495                 500

Val Ala Thr Gly Gln Ala Ile Ser Tyr Glu Leu Asp Gly Val Gln
        505                 510                 515

Tyr Ile Ala Ile Gly Ala Gly Gly Leu Thr Tyr Gly Thr Gln Leu
        520                 525                 530

Asn Ala Pro Leu Ala Glu Ala Ile Asp Ser Thr Ser Val Gly Asn
        535                 540                 545

Ala Ile Tyr Val Phe Ala Leu Pro Gln
        550                 555

INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 82 bases
(B) TYPE: nucleotide
(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) ORIGINAL SOURCE: synthetic oligonucleotide

CATGAAAATA AAAACAGGTG CACGCATCCT CGCATTATCC GCATTAACGA 50
CGATGATGTT TTCCGCCTCG GCTCTCGCCC AG 82



INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 83 bases
(B) TYPE: nucleotide
(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) ORIGINAL SOURCE: synthetic oligonucleotide



GTTACCTGGG CGAGAGCCGA GGCGGAAAAC ATCATCGTCG TTAATGCGGA 50
TAATGCGAGG ATGCGTGCAC CTGTTTTTAT TTT 83
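The two synthetic oligonucleotides of SEQ ID NO:9 and SEQ ID NO:10 appear to be largely complementary strands. The following is a minimal sketch, not part of the original listing, that checks this by comparing one strand with the reverse complement of the other. The sequence strings are transcribed from the listing with unreadable scan characters resolved by assumption, and the overhang lengths (4 and 5 bases) and the helper names are hypothetical choices made for illustration.

```python
# Sequences transcribed from SEQ ID NO:9 and SEQ ID NO:10 (spaces removed).
# Assumption: characters that are unreadable in the scan are resolved as shown.
SEQ9 = ("CATGAAAATAAAAACAGGTGCACGCATCCTCGCATTATCCGCATTAACGA"
        "CGATGATGTTTTCCGCCTCGGCTCTCGCCCAG")
SEQ10 = ("GTTACCTGGGCGAGAGCCGAGGCGGAAAACATCATCGTCGTTAATGCGGA"
         "TAATGCGAGGATGCGTGCACCTGTTTTTATTTT")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def annealed_overhangs(top: str, bottom: str, top_len: int, bottom_len: int):
    """Check that two strands pair once their 5' overhangs are set aside.

    top_len and bottom_len are the assumed 5' overhang lengths.
    Returns (top overhang, bottom overhang) if the remaining bases pair
    exactly, otherwise None.
    """
    if top[top_len:] != reverse_complement(bottom[bottom_len:]):
        return None
    return top[:top_len], bottom[:bottom_len]


if __name__ == "__main__":
    print(len(SEQ9), len(SEQ10))
    print(annealed_overhangs(SEQ9, SEQ10, 4, 5))
```

Under these assumptions the strands pair over 78 bases and leave unpaired 5' ends on both strands, which is consistent with the stated lengths of 82 and 83 bases.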






INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 residues 

(B) TYPE: amino acid 

(C) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) ORIGINAL SOURCE: E. coli 

(iv) FEATURE: 

FEATURE KEY: sig peptide 
POSITION: 1..26 
FEATURE METHOD: S 



Met Lys Ile Lys Thr Gly Ala Arg Ile Leu Ala Leu Ser Ala Leu
1               5                   10                  15

Thr Thr Met Met Phe Ser Ala Ser Ala Leu Ala Gln
                20                  25      27
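The peptide of SEQ ID NO:11 can be compared against the oligonucleotide of SEQ ID NO:9 by translating the latter with the standard genetic code. The sketch below is an illustration only and makes several assumptions that are not stated in the listing: that the reading frame starts at the second base of SEQ ID NO:9, that the standard codon table applies, and that unreadable scan characters in SEQ ID NO:9 are resolved as shown. The function and constant names are hypothetical.

```python
# SEQ ID NO:9 as transcribed from the listing (assumption: unreadable
# scan characters resolved as shown).
SEQ9 = ("CATGAAAATAAAAACAGGTGCACGCATCCTCGCATTATCCGCATTAACGA"
        "CGATGATGTTTTCCGCCTCGGCTCTCGCCCAG")

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str, start: int = 0) -> list:
    """Translate complete codons of dna from offset start into 3-letter codes."""
    usable = len(dna) - (len(dna) - start) % 3
    return [THREE_LETTER[CODON_TABLE[dna[i:i + 3]]]
            for i in range(start, usable, 3)]


if __name__ == "__main__":
    # Frame assumed to begin at the ATG at base 2 (0-based offset 1).
    print(" ".join(translate(SEQ9, start=1)))
```

With the assumed frame, the translation has 27 residues and reads the same as the peptide given for SEQ ID NO:11.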



INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 bases 

(B) TYPE: nucleotide 

(C) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(iii) ORIGINAL SOURCE: synthetic oligonucleotide 



GTTAGCGCGG TGGATCCCCA TTGGAGG 27 
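The lengths and feature positions given for SEQ ID NO:1 to NO:8 can be cross-checked with simple arithmetic: a CDS of n bases should encode n/3 residues, and the signal and mature peptide lengths should add up to the stated protein length. The sketch below is illustrative only. The pairing of each nucleotide entry with a protein entry (1 with 5, 2 with 6, 3 with 7, 4 with 8) is an assumption made here, as is the reading that the three bases after the CDS are a stop codon; the names are hypothetical.

```python
# (DNA SEQ ID, DNA length, CDS end, protein SEQ ID, protein length,
#  signal peptide length, mature peptide length), values from the headers.
# Assumption: DNA/protein pairing 1-5, 2-6, 3-7, 4-8.
LISTING = [
    (1, 1740, 1737, 5, 579, 23, 556),
    (2, 1740, 1737, 6, 579, 23, 556),
    (3, 1737, 1734, 7, 578, 23, 555),
    (4, 1740, 1737, 8, 579, 23, 556),
]


def is_consistent(dna_len, cds_end, prot_len, sig_len, mat_len):
    """Return True if the CDS, protein and peptide lengths agree.

    Assumes the CDS starts at base 1 and that exactly one 3-base
    stop codon follows it.
    """
    codons, remainder = divmod(cds_end, 3)
    return (remainder == 0
            and codons == prot_len
            and sig_len + mat_len == prot_len
            and dna_len - cds_end == 3)


if __name__ == "__main__":
    for dna_id, dna_len, cds_end, prot_id, prot_len, sig, mat in LISTING:
        ok = is_consistent(dna_len, cds_end, prot_len, sig, mat)
        print(f"SEQ ID NO:{dna_id} / NO:{prot_id}: consistent={ok}")
```

A check of this kind only confirms that the header values agree with each other; it says nothing about the correctness of the individual bases or residues.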






